Exploration via Model-based Interval Estimation

Authors

  • Alexander L. Strehl
  • Michael L. Littman
Abstract

This paper takes an empirical approach to evaluating three model-based reinforcement-learning methods. All methods intend to speed the learning process by mixing exploitation of learned knowledge with exploration of possibly promising alternatives. We consider ε-greedy exploration, which is computationally cheap and popular, but unfocused in its exploration effort; R-Max exploration, a simplification of an exploration scheme that comes with a theoretical guarantee of efficiency; and a well-grounded approach, model-based interval estimation, that better integrates exploration and exploitation and achieves the best performance in our example tasks. Our experiments indicate that effective exploration can result in dramatic improvements in the observed rate of learning.
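
To make the contrast between unfocused and focused exploration concrete, the following is a minimal sketch, not the paper's algorithms. It compares ε-greedy action selection with a simplified count-based upper bound in the spirit of interval estimation. The function names, the `beta` parameter, and the `beta / sqrt(n)` bonus are illustrative assumptions; the abstract does not specify how the intervals are computed.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def optimistic_action(q_values, counts, beta):
    """Pick the action with the highest assumed upper bound q + beta / sqrt(n).

    Hypothetical bonus form: untried actions (n == 0) get an infinite bound,
    so they are selected before any action that has already been tried.
    """
    def upper_bound(a):
        if counts[a] == 0:
            return math.inf
        return q_values[a] + beta / math.sqrt(counts[a])

    return max(range(len(q_values)), key=upper_bound)
```

In this sketch, ε-greedy spreads its random choices evenly over all actions regardless of how often each has been tried, while the optimistic rule directs exploration toward actions whose value estimates are still uncertain.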

Related papers

Interval Estimation for the Exponential Distribution under Progressive Type-II Censored Step-Stress Accelerated Life-Testing Model Based on Fisher Information

This paper determines the confidence interval using the Fisher information under progressive type-II censoring for the k-step exponential step-stress accelerated life testing. We study the performance of these confidence intervals. Finally, an example is given to illustrate the proposed procedures.

A Theoretical Analysis of Model-Based Interval Estimation: Proofs

Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditi...

A confidence-aware interval-based trust model

It is a common and useful task in a web of trust to evaluate the trust value between two nodes using intermediate nodes. This technique is widely used when the source node has no experience of direct interaction with the target node, or the direct trust is not reliable enough by itself. If trust is used to support decision-making, it is important to have not only an accurate estimate of trust, ...

Bayes Interval Estimation on the Parameters of the Weibull Distribution for Complete and Censored Tests

A method for constructing confidence intervals on parameters of a continuous probability distribution is developed in this paper. The objective is to present a model for an uncertainty represented by parameters of a probability density function. As an application, confidence intervals for the two parameters of the Weibull distribution along with their joint confidence interval are derived. The...

Estimation of the mean grain size of mechanically induced Hydroxyapatite based bioceramics via artificial neural network

This study focuses on the estimation of the mean grain size of mechanically induced Hydroxyapatite (HA) through the artificial neural network (ANN) model. The mean grain size of HA and HA based nanocomposites at different milling parameters were obtained from previous studies. The data were trained and tested by the neural network modeling. Accordingly, all data (55 sets) were based on the mecha...

Publication date: 2004